SCREENING METHOD

The present invention relates to a method of screening a protein for 
involvement in cancer. 

Cancer represents the second highest cause of mortality, after heart disease, in 
most developed countries. Current estimates suggest that one in three Americans 
alive at present will suffer from some form of cancer. 

Many different forms of cancer exist, and it is believed that there are many 
different causes of the disease. Amongst the known causes of cancer are DNA 
damage, for example as a result of exposure to carcinogenic chemicals or radiation,
and the actions of transforming viruses. It is further recognised that many cancers 
result from aberrant gene expression, for example as a result of abnormal levels of 
gene expression or of expression of mutated or otherwise altered gene products. 

Although many methods for treating cancer exist, there is a well recognised 
need to develop new and improved techniques. Selection of a suitable treatment for 
cancer may be predicated on correct identification of the aetiology of the disease. For 
this reason it is important to identify the cause of a given patient's cancer. In addition
to new treatments there is a requirement for new diagnostic tools able to detect 
cancers. 

It is believed that there are likely to be hundreds of genes the aberrant 
expression of which is associated with cancer formation. Since the analysis and 
modulation of such genes has the potential to form the basis of both methods of 
diagnosis and treatment of cancers it is a recognised goal of biomedical research to 
identify genes, and their products, involved in oncogenesis. 

Current efforts to identify such genes and proteins rely primarily upon two
strategies. The first is genomic sequencing, in which the entire genomes of
the most common cancers (such as breast cancer, colon cancer, lung cancer and
prostate cancer) are to be sequenced and compared with the normal human genome in
order to identify mutant genes which may play a role in the development of cancer.

The second approach is to identify those genes the expression of which is 
altered in any particular cancer. This may be assessed by the analysis of expression 
levels in samples of cancerous tissue from the individual compared to expression 
levels in either normal tissue from the same individual or control samples taken from
individuals without cancer. 

Both these strategies suffer from significant drawbacks. Analysis of
expression levels does not provide information regarding the presence or absence of
mutations within the genes being expressed. Strategies for comparative transcript
expression analysis, such as micro-array profiling, do not provide any information on
cancer associated point mutations and can prove problematic when dealing with gene
family members, which display areas of significant homology. The Human Cancer
Genome Project will clearly identify cancer-associated point mutations but neither of
the methodologies discussed will provide any direct functional annotation. Thus both
approaches will identify many changes that are not causal and are merely associated
with malignant disease.

Some companies currently use protein/protein interaction mapping as a
general approach, or alternatively retroviral expression of random peptides as a
platform technology; however, these techniques still fail to take the functional
perspective of the proteins investigated into consideration.

According to a first aspect of the present invention there is provided a method
of screening a protein for involvement in cancer comprising: 

i) exposing the protein to a first viral oncoprotein; 

ii) assaying for interaction of the protein and the first viral oncoprotein; 

iii) exposing the protein to a second viral oncoprotein; and

iv) assaying for interaction of the protein and the second viral oncoprotein 



wo 03/079021 



PCT/GB03/00990 



3 

wherein interaction of the protein with the viral oncoproteins indicates that the protein 
is involved in cancer. 

In step i) of the invention the first viral oncoprotein is used as a "bait" to
identify proteins in a library to which it is capable of binding (referred to as "prey").
A protein present in a library to be screened is thus exposed to the first viral
oncoprotein under conditions in which, should the protein represent a target for the
viral oncoprotein, the protein and viral oncoprotein will be able to bind to one
another. Such conditions may, for example, be produced in a cell in which an
interaction trap may be carried out.

Step ii) of the method allows the selection of those proteins screened that have
bound to the first viral oncoprotein, the targets of the oncoprotein, based upon the
interaction of the screened protein and the oncoprotein. This step may also be carried
out in a suitable interaction trap. 

Steps iii) and iv) of the method represent repetitions of steps i) and ii) 
respectively, save that in steps iii) and iv) the protein to be screened is exposed to the
second oncoprotein and the relative binding of the protein and second oncoprotein
assessed.

That a screened protein represents a binding partner for both the first and 
second oncoproteins is taken to indicate that the protein in question is involved in
cancer. 

The screened protein may be contained within a mixture of proteins that may 
or may not be involved in the aetiology of cancer. 

It is preferred that those proteins that exhibit interactions with the first viral 
oncoprotein are identified, for instance by sequencing, and their identities noted. The
same library of proteins may then be exposed to the second viral oncoprotein and the
identities of those proteins interacting with the second oncoprotein established.
Comparison of the proteins interacting with the first and second oncoproteins will
allow the production of "weighted" results in which those proteins interacting with the
greatest number of viral oncoproteins tested represent more favoured targets for
further investigation than those interacting with lesser numbers of oncoproteins. 
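
By way of illustration only, the "weighted" comparison described above might be tabulated as in the following minimal sketch. The function name, the bait labels and the prey identifiers are hypothetical and do not form part of the method; the sketch simply counts, for each prey, the number of distinct viral oncoprotein baits with which an interaction was recorded.

```python
from collections import Counter


def rank_by_interactions(hits_by_bait):
    """Rank prey proteins by the number of distinct baits they interacted with."""
    counts = Counter()
    for preys in hits_by_bait.values():
        counts.update(set(preys))  # a prey is counted once per bait
    # Highest count first; ties are broken alphabetically for a stable order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical screen results: bait label -> prey identifiers that interacted.
example_hits = {
    "HPV16_E6": ["preyA", "preyB", "preyC"],
    "Tax": ["preyB", "preyC"],
    "SV40_large_T": ["preyC", "preyD"],
}
ranking = rank_by_interactions(example_hits)
```

In this hypothetical data set the prey interacting with all three baits is placed first and would be the most favoured candidate for further investigation.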

In an alternative embodiment only those proteins that interact with the first 
viral oncoprotein are exposed to second viral oncoproteins. Preferably the protein is
assayed for interaction with as many of the second viral oncoproteins as are available.
In this case the proteins to be screened may be exposed to the viral oncoproteins
sequentially. Thus in this embodiment of the invention only those proteins that
interacted with the first viral oncoprotein would be exposed to the secondary viral
oncoproteins. Thus after the two rounds of screening, proteins that interact with one
or more of the secondary viral oncoproteins in addition to the first viral oncoprotein
would be identified.

Alternatively the screened proteins may be exposed to all the viral
oncoproteins (first and second) in one step. This allows identification of any protein
targets that interact with more than one oncoprotein from those screened.
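
The difference between the sequential embodiment and the single-step embodiment may be sketched as follows. This is an illustrative sketch only: the `interacts` function stands in for the outcome of an interaction trap, and all protein and bait names are hypothetical.

```python
def sequential_screen(library, first_bait, second_baits, interacts):
    """Two rounds: only preys binding the first bait meet the second baits."""
    primary = [p for p in library if interacts(p, first_bait)]
    result = {}
    for prey in primary:
        bound = [b for b in second_baits if interacts(prey, b)]
        if bound:  # keep preys binding at least one second bait as well
            result[prey] = bound
    return result


def single_step_screen(library, baits, interacts):
    """One round: every prey meets every bait; keep preys with more than one hit."""
    result = {}
    for prey in library:
        bound = [b for b in baits if interacts(prey, b)]
        if len(bound) > 1:
            result[prey] = bound
    return result


# Hypothetical interaction data standing in for interaction trap read-outs.
KNOWN = {("preyA", "E6"), ("preyB", "E6"), ("preyB", "Tax"),
         ("preyC", "Tax"), ("preyC", "E1A")}


def interacts(prey, bait):
    return (prey, bait) in KNOWN


library = ["preyA", "preyB", "preyC"]
sequential = sequential_screen(library, "E6", ["Tax", "E1A"], interacts)
single_step = single_step_screen(library, ["E6", "Tax", "E1A"], interacts)
```

In this hypothetical example the single-step screen also recovers a prey that binds two of the second oncoproteins but not the first, which the sequential screen would not examine further.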

It will be readily appreciated that the possible number of "rounds" of 
screening that may be performed will only be limited by the number of viral
oncoproteins available to a person effecting the invention. 

Proteins identified by the screen as interacting with viral oncoproteins may be
investigated for other known interactions with viral oncoproteins. This may, for
example, be achieved by studying interactions reported in published literature or in
relevant databases. 

It will be appreciated that a given protein that shows more than one 
oncoprotein interaction according to the method of the invention will represent a good
candidate for further investigation and validation as set out below.

It will be recognised that, in the case of multiple rounds of screening with
different oncoproteins, those proteins that exhibit interactions with a greater number
of oncoproteins will represent better candidates for further investigation than those
proteins interacting with a lesser number of oncoproteins. The number of interactions
which a given protein is deemed to exhibit may take into consideration interactions
reported in, for example, published scientific literature, in addition to interactions
identified by means of, for example, screening using interaction traps.

Preferably the cancer is a non-viral cancer. 

The tissue from which the protein is derived is preferably a tissue having a high
proliferative potential, but in which the level of proliferation is low. The tissue preferably
has a highly complex pattern of gene expression combined with a high capacity for
proliferation. An indication of such a tissue type may be that its cells possess a relatively
open chromatin structure, which is itself an indication of promiscuous low level gene
expression characteristic of undifferentiated stem cells. It is preferred that the tissue is
selected from the group comprising placenta, cord blood CD34+ haemopoietic stem cells
and foetal brain. Most preferably the protein is derived from placenta or cord blood
CD34+ haemopoietic stem cells.

The protein to be screened is preferably derived from a cDNA library.
Alternatively the protein may comprise a whole cell extract or selected proteins
expressed by a cell type of interest.

cDNA libraries suitable for use in the invention may be derived from any
mammalian tissue. It will be appreciated that the cDNA library is ideally derived
from human tissue when human genes involved in cancer are being screened.

cDNA libraries for use in the invention may be produced by any suitable
method known in the prior art. Examples of suitable methods that may be used for
the production of cDNA libraries are well known to those skilled in the art.

The exposure of the protein to be tested to the selected viral oncoproteins, and
the assessment of their interaction, is preferably performed as part of an interaction
trap. Many forms of interaction trap are known in the prior art, although it is
preferred to use a yeast two-hybrid interaction trap to put the invention into effect.

Yeast two-hybrid screening is a strategy for screening for proteins that interact
with a particular protein. Typically a cDNA library is constructed such that candidate
proteins are expressed as translational fusions with part of a reporter gene. Yeast cells
are then co-transfected with a "bait" construct consisting of the cDNA of interest
fused in-frame with the other part of the reporter gene. Only if both expressed 
proteins physically interact will the two parts of the reporter gene be sufficiently 
closely associated to generate a signal. 

Most preferably the yeast two-hybrid interaction trap to be used may be
modified as described in Gyuris, J.; Golemis, E.; Chertkov, H. and Brent, R. "Cdi1, a
human G1 and S phase protein phosphatase that associates with Cdk2." (Cell. 1993
Nov. 19; 75 (4): pp. 791-803). The technique described in this paper is incorporated
herein by reference. It will, however, be appreciated by those skilled in the art that
any other form of interaction trap may be used to put the invention into practice.
Suitable examples include techniques such as mammalian two-hybrid, bacterial
two-hybrid or alternatively various types of pull down assay using, for example, an
immobilised hybrid of the bait protein fused to a tag protein such as glutathione
S-transferase (GST), or any other protein suitable for use for this purpose.

In order, when using a yeast two-hybrid interaction trap, to most efficiently 
assay for those yeast cells in which both bait and prey proteins are present it is
necessary to selectively amplify the required yeast cells. This is independent of the
presence, or lack thereof, of interactions between the proteins of interest. In prior art
yeast two hybrid techniques it is usual to perform this selective amplification using
agar plates which incorporate a suitable selection medium. The total number of
transformed yeast cells is inoculated onto the plate, which will only permit the
growth of those cells that contain both the cDNA derived protein and the target
protein. Such cells then produce viable colonies that are subsequently transferred to
other culture plates to allow assessment of interactions. 

Whilst it is advantageous to incorporate such an amplification step into the 
yeast two hybrid screen, there are notable drawbacks in prior art techniques in that it 
is difficult and time consuming to transfer the amplified colonies from the agar plates
on which they have been amplified to the plates on which analysis of interactions will
be performed. 

We have discovered that by performing the amplification step of the yeast two
hybrid screen in free solution of the suitable selection medium these disadvantages 
may be overcome. 

The amplification of the cDNA library in yeast in free solution provides
significant advantages in terms of both the ease and the speed with which the yeast cells
that have been positively transformed can be harvested prior to plating. Furthermore
there is an increase in the efficiency of recovery of the amplified yeast cells from the
culture.

Previously published methods of performing yeast two hybrid interaction
screening use an inoculating loop to directly replica transfer each single colony of
cells from the master screen plate to each of the plates used for growth. We have
found that by spotting a defined aliquot of a colony re-suspended in a suitable diluent,
such as sterile distilled water, rather than simple replica plates produced by direct
colony transfer from the master plate, a defined number of cells are transferred in a 
precise volume per spot. The consequence of this is that subsequent growth is more 
uniform and can be compared in a much more precise manner. 
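
The arithmetic behind spotting a defined aliquot may be illustrated as follows. The cell densities and spot volume used are hypothetical values chosen only to show the calculation; no particular density or volume is prescribed by the method.

```python
def cells_per_spot(cells_per_ml, spot_volume_ul):
    """Number of cells delivered in one spot of the given volume (microlitres)."""
    return cells_per_ml * spot_volume_ul / 1000.0


def dilution_factor(measured_cells_per_ml, target_cells_per_ml):
    """Fold dilution needed to bring a re-suspended colony to the target density."""
    return measured_cells_per_ml / target_cells_per_ml


# Hypothetical values: a suspension at 1e6 cells/ml spotted in 5 microlitre aliquots.
spot_cells = cells_per_spot(1e6, 5)
# Hypothetical values: a colony re-suspended at 4e6 cells/ml, target 1e6 cells/ml.
fold = dilution_factor(4e6, 1e6)
```

Because every spot then starts from the same calculated number of cells, differences in subsequent growth can be attributed to reporter activation rather than to the size of the inoculum.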

Alternatively primary positive colonies can be inoculated into an appropriate
96 well microtitre plate and growth amplified to a uniform suspension. Replica plates
can then be made using a 96 well stainless steel pin replicator and the growth of
replica plates compared under different reporter gene activation conditions. Again the
growth is uniform since all wells are started from the same amount of inoculum.

Oncoproteins that may be used according to the present invention may be 
selected from the oncoproteins expressed by any transforming virus. Preferably the
first and second oncoproteins to be used according to the present invention are
selected from the group comprising oncogenic human papilloma virus (HPV), such as
types 6, 16 and 18, E6, E7 and E5 proteins, hepatitis B "X", hepatitis C "Core", SV40
large "T" and small "T", adenovirus "E1A" and "E1B", human T lymphotropic virus
types 1 and 2 "Tax" proteins, Epstein Barr virus "LMP1" and "EBNA3", JC virus
large "T" and small "T". The selected oncoproteins may be used either singly or in
combinations as described above.

Preferably the first viral oncoprotein comprises HPV 16 E6. 

Preferably the second viral oncoprotein comprises Tax. 

In a preferred modification of the invention a protein, identified by the screen 
as being involved in cancer, may further be exposed to another (second) protein, or
proteins, and an assay conducted for interactions between the two proteins. In such a
modification proteins identified by their interaction with viral oncoproteins according
to steps i) to iv) of the first aspect of the invention represent "anchor protein targets"
that are then used to identify "secondary protein targets". The secondary protein
targets identified in this case are those examples of a second cellular protein that
interact with the primary anchor protein target. Thus dissection of the interactions of 
an anchor protein target involved in cancer enables identification of further secondary 
protein targets that may also be involved in cancer and may represent suitable targets 
for further investigation as set out below. 

Such secondary protein targets may, for instance, represent up or downstream
members of intracellular protein networks in which the anchor protein target is
involved. The interaction of the viral oncoproteins with the anchor protein target
indicates that the oncoproteins are able to influence the activity of the network and
hence, indirectly, the activity of the secondary protein target. Examples of such
networks include signalling pathways and pathways influencing gene regulation.
Proteins identified in this way will represent targets for further investigation as
modulators or markers of cancer.

A protein identified as an anchor protein target represents a suitable target for
future investigation with respect to its involvement in cancer. However the use of
these anchor protein targets as a means by which further secondary protein targets
involved in cancer may be identified is a great benefit provided by the invention. 

So important is this benefit that according to a second aspect of the invention 
there is provided a method of screening a protein sample for proteins that are
secondary protein targets for viral oncoproteins comprising: 

(a) exposing an anchor protein target identified as a protein involved in cancer 
according to the first aspect of the invention to the protein sample; and

(b) assaying for interaction of proteins within the sample with the anchor protein; 

wherein proteins identified by their interaction with anchor protein targets in step (b) 
represent secondary protein targets involved with cancer.
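
Steps (a) and (b) may be summarised by the following illustrative sketch, which groups observed interactions by sample protein. The function name, the pair format and all identifiers are hypothetical; the observed pairs stand in for the read-out of an interaction trap.

```python
from collections import defaultdict


def find_secondary_targets(anchor_targets, observed_pairs):
    """Return {sample protein: sorted list of anchor targets it interacted with}."""
    anchors = set(anchor_targets)
    secondary = defaultdict(set)
    for bait, partner in observed_pairs:
        if bait in anchors:  # only baits that are anchor protein targets count
            secondary[partner].add(bait)
    return {partner: sorted(found) for partner, found in secondary.items()}


# Hypothetical anchors and observed (bait, sample protein) interaction pairs.
anchors = ["anchor1", "anchor2"]
pairs = [
    ("anchor1", "sampleX"),
    ("anchor2", "sampleX"),
    ("anchor1", "sampleY"),
    ("other", "sampleZ"),
]
secondary = find_secondary_targets(anchors, pairs)
```

A sample protein interacting with several anchor protein targets, such as the hypothetical "sampleX" here, might reasonably be examined first.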

Protein samples used according to the second aspect of the invention may 
preferably be derived from the same tissue as the protein identified as being involved 
in cancer. Interactions between the proteins may be tested and assessed by means of 
an interaction trap, preferably by means of an interaction trap as described above. 

Those proteins that are selected by any aspect of the invention as potentially 
involved in oncogenesis may then be sequenced in order that their identity may be 
discovered. Plasmids encoding the protein of interest may be extracted from cells
used in the interaction trap and their sequence determined by any suitable sequencing
method. Suitable means by which the plasmids may be extracted and the sequence
information obtained will be readily apparent to those skilled in the art.

Knowledge of the sequence of a protein of interest, or of the gene encoding
the protein, will allow searches of relevant databases to be undertaken in order to
establish, where possible, the identity of the protein or gene. This identity
information may then be used to investigate other reported interactions of the protein
in question and also establish the function of the protein. Information regarding
known interactions may allow the identification of other members of functional
pathways as set out above. Information about the function of the protein may be
useful in identifying the possible mode of action of the protein in oncogenesis, or in
suggesting suitable means by which activity of the protein may be modulated, for
instance by known modulators of a class of proteins of which the protein of interest 
may be a member. 

A preferred embodiment would be the identification of a novel interaction of 
any secondary protein target with any of the viral oncoproteins listed thus picking out 
the particular anchor/secondary protein target pathway as being the target of multiple 
viral insults. Ideally the new secondary protein target would also be present in the 
newly acquired catalogue of anchor protein targets. 

Bioinformatic analysis of proteins identified according to the first or second 
aspects of the invention and also of pathways identified by the screen as being
involved in cancer may be carried out as follows. Genes extracted from plasmids
encoding anchor or secondary protein targets can be identified by the use of BLAST
search of the non-redundant and "Expressed Sequence Tag" (EST) NCBI nucleotide
databases. This strategy can also be used to identify other gene family members.
Expression profiles can be evaluated in silico by the use of SAGE tag virtual Northern
Blots. Information regarding reported functions and known interactions of targets
identified by the screen as involved in cancer may be contained in scientific
publications or other public domain databases. These can be accessed via PubMed,
UniGene and GeneCards, as can information regarding the chromosomal location of
targets. Chromosomal location can also be ascertained by use of the Human Genome
Gateway BLAT search program which provides information regarding intron/exon
boundaries, gene structure, identity of adjacent genes and available ESTs together
with access to gene prediction programs such as ENSEMBL. Chromosomal locations
can be evaluated for cancer associated amplifications/deletions etc. by the use of the
PubMed database. The PubMed database can also be used to identify any secondary
protein targets previously shown to interact with an anchor protein target under
investigation. For example searches can be performed incorporating the name of the
prospective target and that of a suitable interaction trap method. Alternatively the
names of additional viral oncoproteins can be incorporated as search terms along with
the names of either anchor protein targets or secondary protein targets identified using
specific oncoprotein baits, or the genes encoding such targets. Using PubMed all
prospective "anchor" and "secondary" targets can be cross referenced with the broad
functional names of categories of protein that are known to be responsive to the action
of commonly available "drugs". Examples of classes of target proteins for which a
range of existing drugs are available include proteases, kinases, phosphatases, ion 
channels etc. 
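
The literature searches described above combine the name of a target with a second term, such as an interaction trap method, a viral oncoprotein or a drug-responsive protein class. A minimal sketch of how such search strings might be assembled is given below. It only builds quoted phrases joined by the Boolean operator AND and does not contact any database; the target name and the helper function are hypothetical.

```python
def build_queries(targets, companion_terms):
    """Pair every target name with every companion term as a Boolean AND query."""
    return [f'"{target}" AND "{term}"'
            for target in targets
            for term in companion_terms]


# Hypothetical target name combined with a method, an oncoprotein and a protein class.
queries = build_queries(["proteinX"], ["two-hybrid", "Tax", "kinase"])
```

Each resulting string can then be entered as a search term and the retrieved reports reviewed manually.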

Homology comparison of two sequences can be carried out using pairwise
BLAST (NCBI). If the gene, and hence target protein, function is unknown this can
be evaluated by using protein/protein cross species homology searches to such
organisms as C. elegans, D. melanogaster, S. cerevisiae, or M. musculus. Cross
species interlogs can also be used to identify additional potential interactors within 
common pathways. 

Knowledge of the function of a protein of interest may represent another 
criterion by which favoured targets for further investigation or validation may be
identified. For example it may be preferred to investigate those proteins of a certain 
class, or for which known modulators exist, as a matter of priority. 

Information regarding the sequence of plasmids encoding those proteins 
identified by the interaction trap as potentially of interest is also useful in discounting 
coding sequences that may give "false positive" results. Such coding sequences may, 
for example, be those located in untranslated regions or other genomic contaminants.

In a preferred embodiment the method of the invention further comprises a
validation step. Proteins identified by the method of the invention as involved with
cancer, or the nucleic acids encoding such proteins, represent suitable "targets" for
such validation.

Such a validation step may, for example, comprise analysis of expression
levels of targets identified by the screen. Analysis of levels of expression of
identified targets, and also of mutated forms of identified targets, may preferably be
conducted by means of comparing the degree of expression of gene products, and the
identity of the products expressed, between samples of cancerous and non-cancerous
tissues. Preferably such tissue samples will be derived from the same tissue types,
and most preferably the cancerous and non-cancerous tissue samples will be derived 
from the same individual. 

Analysis of expression of targets identified by the method of screening of the
invention may also include analysis of expression of mutant forms of targets
identified. By "mutant forms" we mean any proteins having at least 50% sequence
homology (i.e. the sequence of the amino acids forming the protein or of the nucleic
acids encoding the protein) with the targets identified by the screening method.
Preferably mutant forms of the target will share at least 75% homology with the
wild-type target, and most preferably mutant forms of the target share at least 90%
homology with the wild-type target gene or gene product. 
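
The 50%, 75% and 90% thresholds may be applied as in the following sketch. It is an illustration under stated assumptions: homology is approximated here by percentage identity over two sequences that have already been aligned to equal length (with "-" marking gaps), and the function names and example sequences are hypothetical.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of aligned positions at which the two sequences are identical."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == "-" and y == "-":
            continue  # a column that is a gap in both sequences carries no information
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * matches / compared


def mutant_tier(identity):
    """Map a percentage identity onto the preference tiers stated in the text."""
    if identity >= 90:
        return "most preferred"
    if identity >= 75:
        return "preferred"
    if identity >= 50:
        return "mutant form"
    return "not a mutant form"


# Hypothetical wild-type and variant peptide sequences differing at one position.
identity = percent_identity("MKTAYIAKQR", "MKTAYLAKQR")
tier = mutant_tier(identity)
```

In practice the alignment itself would be produced by a dedicated tool such as pairwise BLAST, and a different measure of homology might be preferred.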

In an alternative validation step mutant forms of nucleic acids encoding target
proteins identified by the screening method may be evaluated and analysed by any
suitable technique known in the prior art. For example, such analysis may be
performed by multiplex PCR. Multiplex PCR may, for example, be performed on a
real time multiplex PCR machine. Alternatively RNA expression and altered splice
forms can also be evaluated by Northern Blots. Comparison of expression levels can
be carried out using matched pair tumour/normal total cDNA array blots such as those
produced by Clontech.

A suitable validation step may also comprise analysis of the presence of point
mutations in the genes encoding proteins identified by the screening method. A
suitable method by which such an investigation may be carried out is by assessment
of denaturing HPLC (Transgenomic Wave analysis) studies comparing cDNA
amplification products derived from cancerous and non-cancerous tissue samples of
the type indicated above. Other sequencing methods known in the art are also
suitable for use in analysis of putative mutations in genes encoding target proteins.

Mutations of genes encoding targets identified by the screening method of the 
invention may also be detected by analysis of the genomic DNA from which the
cDNA referred to above are derived. 

Another parameter that may be investigated in any validation step used may be
"allelic loss" or "homozygous loss" among genes encoding target proteins identified
by the screen of the invention. Homozygous loss occurs when a specific gene has
deletions present in both alleles. Homozygous loss can be detected by PCR of either
genomic DNA or cDNA. Other suitable techniques that may be employed are Northern or
Southern blotting. Allelic loss occurs when only one allele of a gene is deleted.
Allelic loss can be demonstrated by analytical PCR comparison of genomic DNA
from normal and tumour tissues using an informative single nucleotide polymorphism
(SNP), identified from the public database, or a designed SNP assay. Allelic loss is
indicated by the presence of a restriction fragment polymorphism in one allele and not
the other. 
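
The distinction between homozygous loss and allelic loss at an informative SNP may be illustrated by the following idealised sketch. It assumes that the alleles detected in matched normal and tumour DNA are supplied as simple lists and ignores practical complications such as contamination of tumour samples with normal cells; the function name and the returned labels are hypothetical.

```python
def classify_loss(normal_alleles, tumour_alleles):
    """Classify a locus from the SNP alleles detected in normal and tumour DNA."""
    normal = set(normal_alleles)
    tumour = set(tumour_alleles)
    if not normal:
        return "no signal in normal"
    if not tumour:
        return "homozygous loss"  # neither allele is detectable in the tumour
    if len(normal) < 2:
        return "uninformative"    # SNP is not heterozygous in normal tissue
    if len(tumour) == 1:
        return "allelic loss"     # one of the two normal alleles is missing
    return "retained"


# Hypothetical genotype calls at a single SNP for four normal/tumour pairs.
calls = [
    classify_loss(["A", "G"], ["A"]),
    classify_loss(["A", "G"], []),
    classify_loss(["A"], ["A"]),
    classify_loss(["A", "G"], ["A", "G"]),
]
```

An SNP is informative for allelic loss only where the normal tissue is heterozygous, which is why a homozygous normal genotype is reported separately.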

Validation of the interactions and properties of both target proteins and their
coding genes may also be investigated in vitro or in vivo. 

For example genes encoding targets may be cloned into vectors expressing
detectable "tags" such as the pcDNAV5His vector. Expression of a gene of interest
using this vector causes the protein to be produced bearing a small antigenic "tag"
protein V5 that facilitates the immuno-localisation of the target protein. Thus
expression of targets in conjunction with such tag proteins is particularly suited to the
use of immuno-precipitation studies to investigate the interactions of the target protein
and other molecules, such as viral oncoproteins or other members of putative 
pathways, in cultured cells. 

Proteins identified according to the first or second aspects of the invention as 
being involved with cancer, that is to say anchor protein targets and secondary protein
targets, represent targets for therapeutic or diagnostic intervention.

Further information as to the effect of over-expression of a protein identified 
according to the first or second aspects of the invention, or alternatively blockade of 
its expression, may be obtained by, for example, gene transfer experiments or the use
of anti-sense or siRNA oligonucleotides. Such experiments may preferably be
undertaken in a wide variety of transformed or non-transformed cell lines depending on
the activity that it is sought to assess. Effects on cell growth, contact inhibition, 
altered ability to undergo anchorage independent growth and apoptosis may be 
typically measured. Suitable systems by which gene expression may be induced 
include the Invitrogen GeneSwitch system in which induction of gene expression may 
be brought about by exposure to mifepristone. In vivo analysis of the effects of 
expression of target proteins and their possible interactions may be effected using well 
known techniques such as in vivo tumour xenograft models. 

A protein that is indicated, by the methods of the invention, to be involved 
with cancer may be selected as a target for modulation for therapy for cancer, or as a 
marker for the diagnosis of cancer. 

Once targets have been identified by means of the screening method of the 
invention then compounds that may be used to provide novel cancer therapies based 
upon the manipulation of levels or activity of the target may be found. Such 
compounds may, for instance, be compounds that influence the level of transcription 
of the gene encoding the target, or influence the accumulation or bio-availability of 
the target.

The compounds may be used to treat cancer as a monotherapy (i.e. use of the
compound alone) or in combination with other compounds or treatments used in
cancer therapy (e.g. chemotherapeutic agents, radiotherapy).

Known procedures, such as those conventionally employed by the 
pharmaceutical industry (e.g. in vivo experimentation, clinical trials etc), may be used
to establish specific formulations of medicaments and precise therapeutic regimes
(such as daily doses of the agents and the frequency of administration) that may be
used to influence the expression of identified targets and thereby provide novel
therapies for cancers in which the target is implicated.

Such novel therapies may be used for the purposes of treating an existing 
cancer or may be used as a prophylactic treatment administered to a person believed 
to be at risk of developing such a cancer. 

Once targets have been identified it will be appreciated that such targets may 
be used as the basis for methods for the diagnosis of cancer. Such methods may, for 
example, comprise analysing a cell sample from a patient for the presence of a mutant
form of the target and/or altered expression of the wild type target.

Diagnosis may be effected on a sample from a patient believed to be suffering
from cancer, or alternatively from a patient believed to be at risk of developing cancer.

Diagnosis according to the invention may be effected in order to establish 
whether or not a patient is suffering from cancer. As an alternative, diagnosis may be
effected in order to assess the suitability of a specified therapeutic regime for the
treatment of a patient's disease. 

The sample taken may, for instance, be a tissue biopsy, blood sample or swab.

Diagnosis of the cellular levels of either the wild-type or mutant forms of the
target may be carried out by assessing the level of the product of the target within the
cell, or alternatively by taking a measurement of the level of transcription of the gene
encoding the target. The expression of wild-type or mutant products of the target
may, for example, be assessed through the use of specific binding agents including
polyclonal and monoclonal antibodies in techniques such as immuno-cytochemistry,
immuno-precipitation, immuno-blotting (Western blotting) or enzyme linked
immuno-sorbent assay (ELISA).

Detection of the wild type or mutant form may be directed to either detection of 
the protein, or detection of the genetic material encoding the wild-type or mutant 
protein. 

A preferred experimental protocol for effecting the first aspect of the invention is 
as follows: 

Steps i) to iv) : 

Proteins encoded by a suitable cDNA library, such as one derived from placenta 
or foetal brain, are screened by a yeast two-hybrid methodology (Gyuris et al., supra) 
with selected oncoproteins, such as HPV 16 E6 and "Tax", as both 5' and 3' LexA 
fusions. A Qbot (Genetix) is used to facilitate the screening process. The results of 
this screen are a 96 well format panel of yeast clones, which express the putative 
protein prey for each viral oncoprotein. Duplicate multiwell plates are inoculated and 
screened by blue/white X-Gal selection for interactions between the proteins being 
screened and the viral oncoproteins. 

Sequencing of genes encoding proteins identified by the method of the invention 
as being involved in cancer: 

Colonies exhibiting positive interactions between proteins being screened and the 
viral oncoproteins are picked into 2 ml deep well 96 well plates and expanded in culture, 
and cDNA "prey" inserts are amplified from a small aliquot of this culture by the use of 
vector flanking primers. All preys are preferably PCR sequenced by use of an 
automated DNA sequencer (e.g. ABI 3100) and cDNA inserts identified by BLAST 
search. Alternatively, so called "smash and grab" yeast plasmid preparations can be 
prepared from each 2 ml deep well yeast culture and used to transform E. coli KC8 
cells growing on M9 minimal medium plus essential amino acids, minus tryptophan. 
Colonies obtained after approximately two days contain the tryptophan auxotroph 
library plasmid that contains the putative prey interactor. These colonies will either be 
used to PCR amplify the library prey insert directly, or plasmid can be amplified by 
expansion of the colony in liquid culture and plasmid purification carried out. Both of 
these approaches provide prey material that can be sequenced as previously described. 

Bioinformatics analysis of genes encoding proteins identified by the method of 
the invention as being involved with cancer: 

This provides the identity of all candidate genes, thereby allowing identification 
of those potential "anchor" target preys that are common to more than one viral 
oncoprotein, and allows grouping into categories according to gene function by the use 
of bio-informatics (e.g. NCBI database, PubMed, Human Genome Gateway etc.). It 
also allows elimination of: sibling clones derived from the same parent cDNA; and clones 
derived from non-coding sequence and out of frame sequence. Those protein preys 
common to more than one oncoprotein are the first to be selected for target validation 
studies. Some protein preys may not directly interact with more than one oncoprotein 
but may represent strong candidates since there may be alternative "secondary" 
targets identified within a common pathway that interact with additional viral 
oncoproteins to produce a similar oncogenic effect. Thus it is also important to 
establish the possibility of multiple viral oncoprotein involvement in common cellular 
pathways. The result of this analysis is the identification of putative protein targets 
within an interaction cascade from the protein identified according to the first aspect 
of the invention. 
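
By way of illustration only, the selection of "anchor" preys may be sketched in Python as follows. The sketch assumes that each screen has already been reduced to a list of (clone identifier, gene) pairs per oncoprotein. The function name, the data layout and the placeholder gene names are hypothetical and do not form part of the protocol described above.

```python
from collections import defaultdict


def find_anchor_preys(hits_by_oncoprotein):
    """Return {gene: sorted oncoprotein names} for genes hit by more than one oncoprotein.

    hits_by_oncoprotein maps an oncoprotein name to an iterable of
    (clone_id, gene) pairs. Sibling clones of the same gene collapse to a
    single entry because only the gene name is retained.
    """
    oncoproteins_by_gene = defaultdict(set)
    for oncoprotein, hits in hits_by_oncoprotein.items():
        for _clone_id, gene in hits:
            oncoproteins_by_gene[gene].add(oncoprotein)
    return {
        gene: sorted(partners)
        for gene, partners in oncoproteins_by_gene.items()
        if len(partners) > 1
    }


# Hypothetical screen results; clone identifiers and gene names are placeholders.
example_hits = {
    "HPV16_E6": [("A1", "GENE_A"), ("A2", "GENE_A"), ("B7", "GENE_B")],
    "Tax": [("C3", "GENE_A"), ("D4", "GENE_C")],
}
anchors = find_anchor_preys(example_hits)
```

In this hypothetical input only GENE_A is reported, since it is the only prey recovered with both oncoproteins, and its two sibling clones from the HPV 16 E6 screen are counted once.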

A preferred protocol for effecting the second aspect of the invention is as 
follows: 

The steps i) to iv) identified above were performed to identify anchor protein 
targets for use in the second aspect of the invention. The following further steps are 
also performed. 

Second round yeast two hybrid screens may be carried out using selected anchor 
protein target as bait. The products of these screens represent potential new 
secondary protein targets, which are then compared to the newly identified catalogue 
of anchor protein targets of the different viral oncogenes studied. These secondary 
protein targets are also functionally evaluated by the use of bio-informatics for in 
silico interaction mapping. On the basis of these findings, targets are then prioritised 
for further validation studies according to "druggability" (e.g. ion channel, kinase, 
proteases etc.) and the number of viral oncogenes that target a particular interaction 
cascade. If the function of any novel target is unknown it is possible to gain clues to 
this by examining the databases of other species such as C. elegans, S. cerevisiae, D. 
melanogaster etc. for homologous sequences, since there may be functional data on 
these homologues. Thus a matrix of weighted results of those cellular proteins and 
their pathways and the number of viral oncoproteins that interact with the pathway is 
produced. 
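
One possible way of producing such a weighted ranking is sketched below in Python, by way of illustration only. The scoring rule (the number of viral oncoproteins targeting the cascade plus a fixed bonus for a druggable protein class), the weight value, the field names and the placeholder targets are all assumptions made for the sketch. The description above does not specify how the weights are chosen.

```python
# Protein classes treated as "druggable" in this sketch (taken from the
# examples given in the text: ion channel, kinase, proteases).
DRUGGABLE_CLASSES = {"ion channel", "kinase", "protease"}


def prioritise_targets(targets, druggability_weight=2.0):
    """Return the targets sorted by a simple weighted score, highest first.

    Each target is a dict with the hypothetical keys:
      name           - target identifier
      protein_class  - functional class from bio-informatic annotation
      n_oncoproteins - number of viral oncoproteins targeting its cascade
    """
    def score(target):
        druggable = target["protein_class"] in DRUGGABLE_CLASSES
        bonus = druggability_weight if druggable else 0.0
        return target["n_oncoproteins"] + bonus

    return sorted(targets, key=score, reverse=True)


# Hypothetical candidates; names and annotations are placeholders.
candidates = [
    {"name": "TARGET_1", "protein_class": "adaptor", "n_oncoproteins": 2},
    {"name": "TARGET_2", "protein_class": "kinase", "n_oncoproteins": 1},
    {"name": "TARGET_3", "protein_class": "unknown", "n_oncoproteins": 1},
]
ranked_names = [t["name"] for t in prioritise_targets(candidates)]
```

With the assumed weight of 2.0 the kinase is ranked ahead of the adaptor that interacts with two oncoproteins. A different weighting would give a different order, which is why the weight is exposed as a parameter.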

Validation steps that may be used according to the first and second aspects of the 
invention are as follows: 

Validation Phase One: 

Selected "anchor" and "secondary" target genes are fed directly into a detailed 
analysis of transcript expression (Multiplex PCR), mutation (Denaturing HPLC, 
Sequencing), allelic loss etc. in a variety of human cancers and corresponding normal 
tissue types isolated, wherever possible, from the same patient. These data also 
provide a basis for distinguishing between targets that are associated with viral life 
cycle and those that are associated with oncogenesis. This phase of the analysis may 
be performed using a real time multiplex PCR machine such as the MX4000 
(Stratagene). 

Validation Phase Two: 

Based on validation phase one findings, selected targets are entered into a program of 
studies using a variety of cell culture systems. Cloning into "Tag" vectors, transient 
transfection and immuno-precipitation are used to confirm oncogene-target interaction 
in mammalian cells. Gene transfer and the use of anti-sense or siRNA 
oligonucleotide mediated gene silencing are used to assess the effects of either over 
expression or blocking expression of selected targets in a variety of different 
transformed and non-transformed cell lines. Gene silencing experiments are 
particularly valuable since they identify potential gain-of-function targets that are 
necessary for the malignant characteristics of transformed cells. pSwitch transformed 
and non-transformed cell lines may be used with the Gene Switch (Invitrogen) 
mifepristone inducible system for evaluating the effects of stable induced expression 
of targets in vitro. The same system can be used for controlled gene expression with a 
murine in vivo tumour xenograft model. 

The invention will now be described, by way of example only, with reference 
to the accompanying example and figure in which: 

Figure 1 represents the results of probing a panel of pair matched cDNAs from 
normal and tumour tissue using radioactively labelled TIP-1 coding sequence. 

EXAMPLE 

Yeast two-hybrid selection, as described by Gyuris et al., supra, was used to identify 
proteins derived from a human placenta cDNA library that bound to the viral 
oncoproteins human papilloma virus type 16 E6 and human T-cell leukaemogenic virus 
"Tax". 

Yeast cells expressing proteins derived from the placenta cDNA library were first 
exposed to HPV 16 E6, and those clones expressing human proteins that interacted with 
the oncoprotein were noted. The same panel of yeast cells expressing the cDNA library was 
then assayed for interactions with "Tax", and those clones that interacted with the 
oncoprotein were noted. Comparison of the lists of interacting clones produced allowed those 
clones that encoded human proteins that reacted to both oncoproteins to be identified. 

The gene encoding one of those placenta derived proteins that exhibited interactions with 
both viral oncoproteins was sequenced after prey plasmid extraction and found to have a 
sequence as follows (Sequence ID No. 1): 

1 AGGGGCGCTC CGGCCAGTGA TTGGCTGGAG GTTTGTTAAC TATTCATGAG GGGGCGGGCC 
61 GAGCGGGGCG GCCTTTGTTA AGCAGCGAGG GCGCGACCGC GGGTACTCTG CTGCCGGCTT 
121 CTGGGAGCGG GGCTGGGCGA CCAGAGCAGG GTCGAGATGT CCTACATCCC GGGCCAGCCG 
181 GTCACCGCCG TGGTGCAAAG AGTTGAAATT CACAAGCTGC GTCAAGGTGA GAACTTAATC 
241 CTGGGTTTCA GCATTGGAGG TGGAATCGAC CAGGACCCTT CCCAGAATCC CTTCTCTGAA 
301 GACAAGACGG ACAAGGGTAT TTATGTCACA CGGGTGTCTG AAGGAGGCCC TGCTGAAATC 
361 GCTGGGCTGC AGATTGGAGA CAAGATCATG CAGGTCAACG GCTGGGACAT GACCATGGTC 
421 ACACACGACC AGGCCCGCAA GCGGCTCACC AAGCGCTCGG AGGAGGTGGT GCGTCTGCTG 
481 GTGACGCGGC AGTCGCTGCA GAAGGCCGTG CAGCAGTCCA TGCTGTCCTA GCAGCCACCA 
541 CCATCTGCGA CTCCTGCCTG CCGCCTCTCT GTACAGTAAC GCCACTTCCA CACTCTGTCC 
601 CCATCTGGCT TCTGCTGACC GCTGGGCCCC AGCTCAGAAG GGCTATAGCT GGTCCCAGAG 
661 GCCTGGCCTG GCCTTCCTTC CCTTCTCCCA TCCCTGGCCT GGGGCCTCTG GGACCAGCTT 
721 TCTCTCCTGG ACACCGAGGA TTGGAAATAA GGGCCTGGAG CTGAGTAGTA GCCAGTCTGC 
781 TGTGACCACA GGCTCAGGTC CGACCCTGCT GCTTGGCCAC AGCAGTGGCT GGGCAAGTGG 
841 GAACCACTAT CTCTTGGGAG CCCCCAAAAG CTGGGAAATG CTGGAGGAAC CAGGCCTTTC 
901 CCGCTTTTGC CTGGCTGCAG GGTTCGGCTC CGCCCCTGCC CCCCAGCCCT CGTGTGTCCA 
961 CACCGCAGTG CCTCTGCCCC TCGGGGGACT GGACACACAT CCTGCCAGAG GCGCTACGAA 
1021 GCTTTGCCCA GATGAAGCCA GGTGGGCTCC GCGTTCACTC CCACTCTCCC GAGGGGTGCT 
1081 GGCCTCCCCA GGGTTTGCCT TCTTACGGAT TTAGACGAGG TTCGAGGCTC ACCTATCAGG 
1141 GCAGCTCTCA GGATTGTCAT TTTCCTCTTT GCCTGTCGGT TTAACTTTTG TATTTTTTTA 
1201 ATCACAAGTT TGATACAAAA TCTTTTTATC GTACTCTTTG GAGATGCCCA TTCTACTTTT 
1261 GAATTTAGCT TTTACTAATT CGCATCTGGA AGCTCAGCAA GTGCACAAGC CTTACTTTGG 
1321 TTACCGTGGA AACCACTGCC GCCCCTCCCC GATGTGGTGC GCTCAATAAA AATGCTGGAA 
1381 TTCAAAAAAA 

Comparison of sequence ID No. 1 with publicly available database information revealed 
that sequence ID No. 1 corresponded to the published sequence of the gene (Accession 
number AF028823) encoding the HTLV "Tax" interacting protein 1 (TIP-1). 

In order to validate the method and confirm that TIP-1 is indeed involved in cancer, 
full length TIP-1 coding sequence was radioactively labelled by random priming with 
α-32P dCTP and used to screen a panel of 250 matched pairs of normalised total cDNAs 
from normal (left hand column dots in Figure 1) and tumour tissue (right hand column 
dots in Figure 1) from the same individual (Clontech) (see Figure 1). This blot covers a 
wide range of human cancer types and the results indicated that the expression of TIP-1 
was up-regulated approximately ten fold in: 28% of (n = 14) ovarian carcinoma, 36% of 
(n = 50) breast carcinoma, 33% of (n = 21) lung carcinoma, 30% of (n = 42) uterine 
carcinoma and 50% of (n = 6) thyroid carcinoma. However, analysis of the histology 
from these various carcinomas indicated that different carcinomas, such as lung, had 
sub-classifications within the overall category. Out of 21 lung carcinomas 5 were 
keratinising, of which none showed any up-regulation of TIP-1. Out of the remaining 16 
lung carcinomas, 8 (50%) showed extensive up-regulation of TIP-1 expression. 
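
The proportion quoted for the non-keratinising lung carcinomas can be checked with a short calculation, shown here in Python by way of illustration only. The counts are those stated in the paragraph above. The variable and function names are arbitrary.

```python
def percent(count, total):
    """Return count as a whole-number percentage of total."""
    return round(100 * count / total)


lung_total = 21                                  # all lung carcinomas on the blot
keratinising = 5                                 # none showed up-regulation
non_keratinising = lung_total - keratinising     # remaining lung carcinomas
upregulated_non_keratinising = 8                 # showed extensive up-regulation

non_keratinising_percent = percent(upregulated_non_keratinising, non_keratinising)
```

Restricting the denominator to the 16 non-keratinising samples gives the 50% figure stated above.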

These results are consistent with TIP-1 being involved in cancer. The fact that TIP-1 
was identified by following methods according to the first aspect of the invention 
validates the claimed method. 



